Detecting Events in a Million New York Times Articles
Authors
Abstract
We demonstrate a newly developed text stream event detection method on over a million articles from the New York Times corpus. The method is designed to operate in a predominantly on-line fashion, reporting new events within a specified timeframe. It detects events by identifying significant changes in the statistical properties of the text, where those properties are efficiently stored and updated in a suffix tree. This demonstration shows that the method is effective at discovering both short- and long-term events (which are often denoted topics), and that it automatically copes with topic drift, on a corpus of 1 035 263 articles.
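The idea of flagging significant changes in text statistics over time can be illustrated with a minimal sketch. This is not the authors' method: it uses a plain dictionary of word n-gram counts as a stand-in for the suffix tree, and a simple binomial z-score as an assumed significance test. The class name `BurstDetector` and the parameters `n_max`, `z_threshold`, and `min_count` are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def ngrams(tokens, n_max=2):
    """Yield all word n-grams of length 1..n_max as tuples."""
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            yield tuple(tokens[i:i + n])


class BurstDetector:
    """Toy on-line detector: flags n-grams whose count in the current
    time window is unusually high compared with all earlier windows.
    A Counter stands in for the suffix tree (an assumption of this sketch)."""

    def __init__(self, n_max=2, z_threshold=3.0, min_count=3):
        self.n_max = n_max
        self.z_threshold = z_threshold
        self.min_count = min_count
        self.history = Counter()  # n-gram counts over all past windows
        self.history_total = 0    # total n-gram occurrences seen so far

    def process_window(self, documents):
        """Score one time window of documents, then fold it into the history."""
        current = Counter()
        for doc in documents:
            current.update(ngrams(doc.lower().split(), self.n_max))
        total = sum(current.values())

        events = []
        if self.history_total > 0 and total > 0:
            for gram, count in current.items():
                if count < self.min_count:
                    continue
                # Smoothed historical rate; always strictly between 0 and 1.
                p = (self.history[gram] + 1) / (self.history_total + 2)
                expected = total * p
                std = math.sqrt(total * p * (1 - p))
                z = (count - expected) / std
                if z >= self.z_threshold:
                    events.append((" ".join(gram), round(z, 2)))

        # Update the statistics so later windows are compared against this one too.
        self.history.update(current)
        self.history_total += total
        return sorted(events, key=lambda e: -e[1])


if __name__ == "__main__":
    detector = BurstDetector(z_threshold=2.0)
    detector.process_window([
        "the council met on monday",
        "the mayor spoke to the council",
    ])
    print(detector.process_window([
        "earthquake hits the city",
        "earthquake damage reported",
        "rescue teams respond to earthquake",
    ]))
```

Because each window is folded into the history after scoring, a term that stays frequent gradually stops being flagged, which is one simple way a detector of this kind can adapt to drift. A suffix tree would additionally allow sequences of arbitrary length to be counted without fixing `n_max` in advance.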
Similar Resources
Lean six sigma process improvement in specimen receiving to improve stat chemistry turnaround times
Objective: As a consequence of stat turnaround times (TATs) chronically exceeding 60 minutes, our laboratory was facing pressure to divert limited resources toward the implementation of an emergency department satellite laboratory. Peer-reviewed literature in clinical laboratory quality assurance and improvement indicates that between 60-70% of errors occur at the pre-analytical level. Thus...
Global News and Awareness: An Examination of El País and The New York Times and their Relation to Public Knowledge and Opinion Levels
This research examined the content of articles about Brazil in The New York Times and El País and the correlation between news coverage and students’ knowledge and opinion levels on current events in Brazil. Content analysis revealed that American news coverage of Brazil in The New York Times had more depth and breadth than Spanish coverage in El País. Questionnaires distributed to University o...
Metadiscourse Markers: A Contrastive Study of Translated and Non-Translated Persuasive Texts
Metadiscourse features are those facets of a text, which make the organization of the text explicit, provide information about the writer's attitude toward the text content, and engage the reader in the interaction. This study interpreted metadiscourse markers in translated and non-translated persuasive texts. To this end, the researcher chose the translated versions of one of the leading newsp...
Recency is good: expanding with fresh news improves event detection in Twitter
Twitter is a popular microblogging site that is a good source of real-time information. Detecting events in Twitter is an ongoing research effort and a fundamental task is clustering tweets according to which (news) event they describe. Document expansion can improve this clustering, especially for Twitter, given that tweets are short. While document expansion using external corpora has been ar...
The Role of Culture in Sports Sponsorship: an Update
Nowadays sponsorship is an important part of sports events. Sports sponsorship offers more benefits, more variety and also it’s a more powerful form of marketing. In general, sponsorship holds a unique position in the marketing mix because it is effective in building brand awareness, provides different marketing platforms and valuable networking and hospitality opportunities. Sponsorship market...